Policy Gradient Reinforcement Learning for Uncertain Polytopic LPV Systems based on MHE-MPC
Abstract
In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (exogenous signals with inexact bounds), in which the Linear Time-Invariant (LTI) vertex models obtained from combinations of the scheduling parameters become incorrect. We first adopt a Moving Horizon Estimation (MHE) scheme to simultaneously estimate the convex combination vector and the unmeasured states based on the observations and the model matching error. To address the incorrect LTI models used in both the MPC and MHE schemes, we then adopt Policy Gradient (PG) Reinforcement Learning (RL) to learn both the estimator and the controller so that the best closed-loop performance is achieved. The effectiveness of the proposed RL-based MHE/MPC design is demonstrated using an illustrative example.
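For orientation, a polytopic LPV model is commonly written as a convex combination of vertex LTI models. The block below is a generic sketch of that standard form, not the paper's exact formulation; the symbols $x_k$, $u_k$, $y_k$, $A_i$, $B_i$, $C$, $\beta_{i,k}$, and the number of vertices $n_v$ are assumptions introduced here for illustration.

```latex
% Generic polytopic LPV form (assumed notation, not taken from the paper)
\begin{aligned}
x_{k+1} &= \sum_{i=1}^{n_v} \beta_{i,k}\,\bigl(A_i x_k + B_i u_k\bigr),
\qquad y_k = C x_k, \\
\beta_{i,k} &\ge 0, \qquad \sum_{i=1}^{n_v} \beta_{i,k} = 1 .
\end{aligned}
```

In this assumed notation, the MHE scheme described in the abstract would estimate both the unmeasured states in $x_k$ and the convex combination vector $\beta_k$, while the RL step would adjust the MHE and MPC schemes to compensate for vertex models $(A_i, B_i)$ that are themselves inexact.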
Similar resources
Model-based Policy Gradient Reinforcement Learning
Policy gradient methods based on REINFORCE are model-free in the sense that they estimate the gradient using only online experiences executing the current stochastic policy. This is extremely wasteful of training data as well as being computationally inefficient. This paper presents a new model-based policy gradient algorithm that uses training experiences much more efficiently. Our approach con...
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. This paper examines, both theoretically and empirically, approaches to merging on- and off-policy updates for deep reinforcement learning. Theoretical resu...
A Polyhedral Off-Line Robust MPC Strategy for Uncertain Polytopic Discrete-Time Systems
In this paper, an off-line synthesis approach to robust constrained model predictive control for uncertain polytopic discrete-time systems is presented. Most of the computational burdens are moved off-line by pre-computing a sequence of state feedback control laws that corresponds to a sequence of polyhedral invariant sets. The state feedback control laws computed are derived by minimizing the ...
Robot reinforcement learning accuracy-based learning classifier systems with Fuzzy Policy Gradient descent (XCS-FPGRL)
This paper presents a novel approach, XCS-FPGRL, for robot reinforcement learning. XCS-FPGRL combines a covering operator and a genetic algorithm. The system is responsible for adjusting precision and reducing the search space according to the reward obtained from the environment, and acts as an innovation discovery component which is responsible for discovering new better reinforcement learnin...
Scalable Multitask Policy Gradient Reinforcement Learning
Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach, which aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...
Journal
Journal title: IFAC-PapersOnLine
Year: 2022
ISSN: ['2405-8963', '2405-8971']
DOI: https://doi.org/10.1016/j.ifacol.2022.07.599